30,535 research outputs found
Cloud-based or On-device: An Empirical Study of Mobile Deep Inference
Modern mobile applications are benefiting significantly from the advancement
in deep learning, e.g., implementing real-time image recognition and
conversational system. Given a trained deep learning model, applications
usually need to perform a series of matrix operations based on the input data,
in order to infer possible output values. Because of computational complexity
and size constraints, these trained models are often hosted in the cloud. To
utilize these cloud-based models, mobile apps will have to send input data over
the network. While cloud-based deep learning can provide reasonable response
time for mobile apps, it restricts the use case scenarios, e.g. mobile apps
need to have network access. With mobile specific deep learning optimizations,
it is now possible to employ on-device inference. However, because mobile
hardware, such as GPU and memory size, can be very limited when compared to its
desktop counterpart, it is important to understand the feasibility of this new
on-device deep learning inference architecture. In this paper, we empirically
evaluate the inference performance of three Convolutional Neural Networks
(CNNs) using a benchmark Android application we developed. Our measurement and
analysis suggest that on-device inference can cost up to two orders of
magnitude greater response time and energy when compared to cloud-based
inference, and that loading model and computing probability are two performance
bottlenecks for on-device deep inferences.Comment: Accepted at The IEEE International Conference on Cloud Engineering
(IC2E) conference 201
Exploring Interpretable LSTM Neural Networks over Multi-Variable Data
For recurrent neural networks trained on time series with target and
exogenous variables, in addition to accurate prediction, it is also desired to
provide interpretable insights into the data. In this paper, we explore the
structure of LSTM recurrent neural networks to learn variable-wise hidden
states, with the aim to capture different dynamics in multi-variable time
series and distinguish the contribution of variables to the prediction. With
these variable-wise hidden states, a mixture attention mechanism is proposed to
model the generative process of the target. Then we develop associated training
methods to jointly learn network parameters, variable and temporal importance
w.r.t the prediction of the target variable. Extensive experiments on real
datasets demonstrate enhanced prediction performance by capturing the dynamics
of different variables. Meanwhile, we evaluate the interpretation results both
qualitatively and quantitatively. It exhibits the prospect as an end-to-end
framework for both forecasting and knowledge extraction over multi-variable
data.Comment: Accepted to International Conference on Machine Learning (ICML), 201
Is Simple Better? Revisiting Non-linear Matrix Factorization for Learning Incomplete Ratings
Matrix factorization techniques have been widely used as a method for
collaborative filtering for recommender systems. In recent times, different
variants of deep learning algorithms have been explored in this setting to
improve the task of making a personalized recommendation with user-item
interaction data. The idea that the mapping between the latent user or item
factors and the original features is highly nonlinear suggest that classical
matrix factorization techniques are no longer sufficient. In this paper, we
propose a multilayer nonlinear semi-nonnegative matrix factorization method,
with the motivation that user-item interactions can be modeled more accurately
using a linear combination of non-linear item features. Firstly, we learn
latent factors for representations of users and items from the designed
multilayer nonlinear Semi-NMF approach using explicit ratings. Secondly, the
architecture built is compared with deep-learning algorithms like Restricted
Boltzmann Machine and state-of-the-art Deep Matrix factorization techniques. By
using both supervised rate prediction task and unsupervised clustering in
latent item space, we demonstrate that our proposed approach achieves better
generalization ability in prediction as well as comparable representation
ability as deep matrix factorization in the clustering task.Comment: version
- …